DETAILED ACTION 

1. This communication is in response to the Application filed on 01/21/2004. Claims 
1-29 are pending and have been examined. 

Information Disclosure Statement 

2. The information disclosure statement (IDS) submitted on 01/21/2004 is in 
compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure 
statement is being considered by the examiner. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-4, 7, 9-12, 15, 17, and 29 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Hon et al. (US 5,680,510) in view of Huang et al. ("Whistler: A 
trainable Text-to-Speech System", 1996). 

As to claims 1, 9, and 29, Hon et al. discloses 

a speech processing system adapted to receive an input related to one of 
speech and process the input (see col. 4, line 47) to provide an output related to 
one of text (see col. 4, line 46 and col. 9, lines 67-col. 10, lines 1-8), the speech 
processing system (see col. 9, line 65) accessing a module (see col. 5, lines 30-33) 
(e.g. The accessing of a storage area of possible phones is seen.) derived 
from a phone set having a plurality of phones for a tonal language (see col. 4, 
lines 30-35 and col. 6, lines 44-48) (e.g. The final part of a syllable consists of two or 
fewer phones as is inherent in the Mandarin Chinese language (see col. 2, lines 
1-5)), the phones being used to model syllables used in the module (see col. 6, 
lines 4-5), the syllables having an initial and final part (see col. 6, lines 4-5), 
wherein the final part comprises a plurality of phones (see col. 2, lines 3-4) (e.g. 
There can be multiple phones existing for the final component of a syllable) that 
jointly and implicitly carry the tonal information (see col. 6, lines 53-63 and col. 7, 
lines 65-col. 8, lines 1-21) (e.g. It is seen by the reference that the tonal 
information is dependent upon the initial, final, or a combination of the two). 

However, Hon et al. does not specifically disclose the input being text and 
the output being speech. 

Huang et al. does disclose the conversion of text to speech from learning 
methods of model parameters (see Abstract). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al. to include a text to speech converter as taught by Huang et al. The 
motivation to have included such an element is to have an alternative means for 
inputting as well as producing a synthesized speech output based upon model 
parameters of the system (see Huang et al., Abstract) as would benefit the 
system of Hon et al. by using the tone related information as output speech for 
producing speech resembling the user. 

As to claims 2 and 10, Hon et al. discloses wherein 

each phone of the final part includes information about the tone (see col. 
6, lines 53-63 and col. 7, lines 65-col. 8, lines 1-21). 

As to claims 3, 4, 11, and 12, Hon et al. discloses wherein 

the tonal language comprises a plurality of different tones with different 
levels of pitch (see col. 6, lines 55-56) (e.g. The Hon et al. reference discloses 
the use of two tones in the example. It is obvious to one skilled in the art that 
these tones represent either high or low tones). 

As to claims 7 and 15, Hon et al. discloses wherein 

each syllable comprises the same form having the initial and the final, the 
final having two phones carrying partial tonal information each (see col. 6, lines 
53-63 and col. 7, lines 65-col. 8, lines 1-21) (e.g. Since the final can possess 
a diphthong or two phones, the tonal information is dependent on the initial, 
final, or a combination of the two). 



As to claim 17, Hon et al. discloses wherein 

the tonal language comprises Chinese or a dialect thereof, such as 
Cantonese (see Abstract). 

5. Claims 8 and 16 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Hon et al. (US 5,680,510) in view of Huang et al. as applied to claims 1 and 9 above, 
and further in view of Chen et al. (US 5,751,905). 

As to claims 8 and 16, Hon et al. and Huang et al. do not specifically disclose the 
syllables of the tonal language include a glide, which is embodied in the initial. 

However, Chen et al. does disclose the glide being included and 
embodied in the initial (see col. 5, lines 42-45). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al. and Huang et al. with the inclusion of a glide as the initial taught by 
Chen et al. The motivation to have included the element involves reducing 
the number of phonemes and the context dependency of the consonants 
(see Chen et al., col. 4, lines 42-46). 

6. Claims 5 and 13 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Hon et al. (US 5,680,510) in view of Huang et al. as applied to claims 1 and 9 above, 
and further in view of Akinlabi et al. ("Tonal Phonology of Yoruba Clitics"). 

As to claims 5 and 13, Hon et al. and Huang et al. disclose the phone being 
associated with a categorical level. 

However, they do not specifically disclose the levels of pitch comprising 
five categorical levels. 

Akinlabi et al. discloses three types of tones being associated 
phonemically (see page 2, sect. 2, lines 1-2). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al. and Huang et al. with three categorical levels taught by Akinlabi et 
al. The motivation to have included five categorical levels involves the inclusion 
of other tone languages such as Yoruba, where three tones are present (see 
Akinlabi et al., page 2, sect. 2, 1st paragraph) as would benefit the teachings of 
Hon et al. to include other tonal languages using tonal information. 

7. Claims 6, 14, 18, and 19 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Hon et al. (US 5,680,510) in view of Huang et al. as applied to claims 
1 and 9 above, and further in view of Chen ("Recognize Tone Languages Using Pitch 
Information on the Main Vowel of Each Syllable"). 

As to claims 6, 14, and 18, Hon et al. and Huang et al. disclose the phone being 
associated with a categorical level. 

However, they do not specifically disclose the levels of pitch comprising 
five categorical levels. 

Chen discloses the use of five pitch levels (see page 4, sect. 7.1, lines 1-3). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to have modified the speech processing system taught by 
Hon et al. and Huang et al. with five categorical levels as taught by Chen. 
The motivation to have included five categorical levels involves the inclusion of 
other tone languages such as Thai, where five tones are present (see Chen, 
page 4, sect. 7.1). 

As to claim 19, Chen discloses 

the tonal language comprising Vietnamese (see page 4, sect. 7.2). 

8. Claims 20, 21, and 24-26 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Hon et al. (US 5,680,510) in view of Huang et al. and in view of Chen 
et al. (US 5,751,905). 

As to claims 20 and 21, Hon et al. discloses 

a speech processing system adapted to receive an input related to one of 
speech and process the input (see col. 4, line 47) to provide an output related to 
one of text (see col. 4, line 46 and col. 9, lines 67-col. 10, lines 1-8), the speech 
processing system (see col. 9, line 65) accessing a module (see col. 5, lines 30- 
33) (e.g. The accessing of a storage area of possible phones is seen.) derived 
from a phone set having a plurality of phones for a tonal language (see col. 4, 
lines 30-35 and col. 6, lines 44-48) (e.g. The final part of a syllable consists of two or 
fewer phones as is inherent in the Mandarin Chinese language (see col. 2, lines 
1-5)), the phones being used to model syllables used in the module (see col. 6, 
lines 4-5), the syllables having an initial and final part (see col. 6, lines 4-5), 
wherein the final part comprises a plurality of phones (see col. 2, lines 3-4) (e.g. 
There can be multiple phones existing for the final component of a syllable) that 
jointly and implicitly carry the tonal information (see col. 6, lines 53-63 and col. 7, 
lines 65-col. 8, lines 1-21) (e.g. It is seen by the reference that the tonal 
information is dependent upon the initial, final, or a combination of the two). 
Further, Hon et al. discloses the different tones with different levels of pitch (see 
col. 6, lines 55-56) (e.g. The Hon et al. reference discloses the use of two tones 
in the example. It is obvious to one skilled in the art that these tones represent 
either high or low tones). 

However, Hon et al. does not specifically disclose the input being text and 
the output being speech. 

Huang et al. does disclose the conversion of text to speech from learning 
methods of model parameters (see Abstract). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al. to include a text to speech converter as taught by Huang et al. The 
motivation to have included such an element is to have an alternative means for 
inputting as well as producing a synthesized speech output based upon model 
parameters of the system (see Huang et al., Abstract) as would benefit the 
system of Hon et al. by using the tone related information as output speech for 
producing speech resembling the user. 

Hon et al. and Huang et al. do not specifically disclose the glide 
dependent initials. 

However, Chen et al. does disclose the glide being included and 
embodied in the initial (see col. 5, lines 42-45). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al. and Huang et al. to include glide dependent initials as taught by 
Chen et al. Further, the motivation to have included the glide embodied in the 
initial involves reducing the number of phonemes and the context 
dependency of the consonants (see Chen et al., col. 4, lines 42-46). 

As to claims 24 and 25, Hon et al. discloses wherein 

each syllable comprises the same form having the initial and the final, the 
final having two phones carrying partial tonal information each (see col. 6, lines 
53-63 and col. 7, lines 65-col. 8, lines 1-21). 

As to claim 26, Hon et al. discloses wherein the tonal language comprises 
Chinese or a dialect thereof, such as Cantonese (see Abstract). 

9. Claim 22 is rejected under 35 U.S.C. 103(a) as being unpatentable over Hon et 
al. (US 5,680,510) in view of Huang et al. and in view of Chen et al., as applied to 
claims 1 and 9 above, and further in view of Akinlabi et al. ("Tonal Phonology of Yoruba 
Clitics"). 

As to claim 22, Hon et al. and Huang et al. disclose the phone being associated 
with a categorical level. 

However, they do not specifically disclose the levels of pitch comprising 
five categorical levels. 

Akinlabi et al. discloses three tones being associated phonemically (see 
page 2, sect. 2, lines 1-2). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al., Huang et al., and Chen et al. with three categorical levels as taught 
by Akinlabi et al. The motivation to have included five categorical levels involves 
the inclusion of other tone languages such as Yoruba, where three tones are 
present (see page 2, sect. 2, 1st paragraph) as would benefit the teachings of 
Hon et al. to include other tonal languages using tonal information. 

10. Claims 23, 27, and 28 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Hon et al. (US 5,680,510) in view of Huang et al. and in view of Chen et al. (US 
5,751,905) as applied to claim 20 above, and further in view of Chen ("Recognize Tone 
Languages Using Pitch Information on the Main Vowel of Each Syllable"). 

As to claims 23 and 27, Hon et al. and Huang et al. disclose the phone being 
associated with a categorical level. 

However, they do not specifically disclose the levels of pitch comprising 
five categorical levels. 

Chen discloses the use of five pitch levels (see page 4, sect. 7.1, lines 1-3). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the speech processing system taught 
by Hon et al., Huang et al., and Chen et al. with five categorical levels. The 
motivation to have included five categorical levels involves the inclusion of other 
tone languages such as Thai, where five tones are present (see Chen, page 4, 
sect. 7.1). 

As to claim 28, Chen discloses 

the tonal language comprising Vietnamese (see page 4, sect. 7.2). 

Conclusion 

11. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Lee (US 5,220,639) is cited to disclose inputting of Chinese characters into a 
computer and recognizing syllables and tones. Chen et al. (US 6,510,410) is cited to 
disclose recognition of tone languages. Zhang et al. (US 6,553,342) is cited to disclose 
a tone based speech recognition using feature vectors. 

The NPL document by Wang et al. ("Complete Recognition of Continuous 
Mandarin Speech for Chinese Language with Very Large Vocabulary using Limited 
Training Data") is cited to teach modeling of sub-syllable models for tone recognition. 
Lee et al. ("Tone Recognition of Isolated Cantonese Syllables") is cited to disclose using 
neural networks for tone recognition. Wang et al. is cited to disclose recognition of 
Mandarin speech. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Paras Shah whose telephone number is (571)270-1650. 
The examiner can normally be reached on MON.-THURS. 7:30a.m.-4:00p.m. EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard, can be reached on (571)272-7603. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 